R Quick Intro Part 1: Basic Data Manipulation
This part takes a closer look at the “tidyverse”, a unified approach to data wrangling in R.
It replaces some of the base R routines and provides a consistent interface to often-used functions.
The documentation of the tidyverse stack of packages is available on the tidyverse website.
First we load the tidyverse stack of packages. Not all of its 29 packages are attached automatically; the readxl package, for example, has to be loaded manually:
library(tidyverse)
library(readxl)
library(dygraphs)
We also load the dygraphs package for chart output.
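To see which packages belong to the tidyverse and which of them are actually attached, a quick check along the following lines may help. This is only a sketch: it assumes the tidyverse and readxl packages are installed, and the variable name all_pkgs is made up for illustration.

```r
library(tidyverse)

# All packages that belong to the tidyverse, attached or not
all_pkgs <- tidyverse_packages()

# readxl is part of the tidyverse ...
"readxl" %in% all_pkgs

# ... but library(tidyverse) alone does not attach it
"package:readxl" %in% search()

# After loading it manually, it appears on the search path
library(readxl)
"package:readxl" %in% search()
```

The startup message printed by library(tidyverse) also lists the core packages that were attached.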
Now we can read in an Excel file with the function read_excel():
data <- read_excel("Data/sales.xls")
str(data)
tibble [7,085 x 11] (S3: tbl_df/tbl/data.frame)
$ ID : num [1:7085] 2004 2005 2006 2007 2008 ...
$ Sales : num [1:7085] 2691.9 910.5 69.2 22.6 4141.7 ...
$ Cost : num [1:7085] 1442 639.1 71.5 22.2 1184.6 ...
$ Category: chr [1:7085] "Spices" "Spices" "Fruit" "Fruit" ...
$ Product : chr [1:7085] "Saffron" "Saffron" "Plums" "Plums" ...
$ SaleDate: POSIXct[1:7085], format: "2008-12-28" "2008-12-28" ...
$ Quarter : chr [1:7085] "Q4" "Q4" "Q4" "Q4" ...
$ Year : num [1:7085] 2008 2008 2008 2008 2008 ...
$ SalesRep: chr [1:7085] "Jessie O'Brien" "Jessie O'Brien" "Jessie O'Brien" "Jessie O'Brien" ...
$ Region : chr [1:7085] "Northeast" "Northeast" "Northeast" "Northeast" ...
$ State : chr [1:7085] "New Jersey" "New Jersey" "New Jersey" "New Jersey" ...
The result, like that of most basic tidyverse functions, is a “tibble”, an enhanced data frame.
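What “enhanced” means in practice can be illustrated with a tiny example. The data below is invented purely for demonstration and has nothing to do with the sales file; the sketch only shows that a tibble still is a data frame but behaves more predictably when subsetting.

```r
library(tibble)

# A small made-up data frame and its tibble counterpart
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
tb <- as_tibble(df)

# A tibble is still a data frame, with extra classes on top
class(tb)
is.data.frame(tb)

# Selecting a single column with [ , ] drops a data frame to a vector ...
df_col <- df[, "x"]
is.data.frame(df_col)

# ... while a tibble always stays a tibble
tb_col <- tb[, "x"]
is_tibble(tb_col)
```

Tibbles also print more compactly: only the first rows and the columns that fit on screen are shown, together with each column's type.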
One of the most basic operations on data is sorting. The tidyverse function is aptly called arrange().
Have a look at different sort orders.
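A few typical sort orders could look like the sketch below. To keep it runnable without the Excel file, it uses a small made-up tibble called sales_demo that borrows the column names Product, Category and Sales from the data above. The rows for Basil and Apples and their values are invented for illustration.

```r
library(dplyr)
library(tibble)

# Small stand-in for the sales data (partly invented values)
sales_demo <- tibble(
  Product  = c("Saffron", "Plums", "Basil", "Apples"),
  Category = c("Spices", "Fruit", "Spices", "Fruit"),
  Sales    = c(2691.9, 69.2, 410.0, 150.5)
)

# Ascending by Sales (the default)
by_sales <- arrange(sales_demo, Sales)

# Descending by Sales, using the helper desc()
by_sales_desc <- arrange(sales_demo, desc(Sales))

# By Category first, then by descending Sales within each category
by_cat_then_sales <- sales_demo %>%
  arrange(Category, desc(Sales))

by_sales
by_sales_desc
by_cat_then_sales
```

The same calls should work on the tibble read from the Excel file, for example arrange(data, desc(Sales)). Note that arrange() returns a sorted copy and leaves the original tibble unchanged.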